## Glimpse of the Box Experiment dataset:
## Rows: 2,795
## Columns: 20
## $ Date <dttm> 2022-09-27, 2022-09-27, 2022-09-27, 2022-09-27,…
## $ Time <dttm> 1899-12-31 09:47:50, 1899-12-31 09:50:07, 1899-…
## $ Data <chr> "Box Experiment", "Box Experiment", "Box Experim…
## $ Group <chr> "Baie Dankie", "Baie Dankie", "Baie Dankie", "Ba…
## $ GPSS <chr> "-28.010549999999999", "-28.010549999999999", "-…
## $ GPSE <chr> "31.191050000000001", "31.191050000000001", "31.…
## $ MaleID <chr> "Nge", "Nge", "Nge", "Nge", "Nge", "Nge", "Nge",…
## $ FemaleID <chr> "Oerw", "Oerw", "Oerw", "Oerw", "Oerw", "Oerw", …
## $ `Male placement corn` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ MaleCorn <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
## $ FemaleCorn <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ DyadDistance <chr> "2m", "2m", "1m", "1m", "0m", "0m", "0m", "0m", …
## $ DyadResponse <chr> "Tolerance", "Tolerance", "Tolerance", "Toleranc…
## $ OtherResponse <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ Audience <chr> "Obse; Oup; Sirk", "Obse; Oup; Sirk", "Oup; Sirk…
## $ IDIndividual1 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ IntruderID <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "Sey…
## $ Remarks <chr> NA, NA, "Nge box did not open because of the bat…
## $ Observers <chr> "Josefien; Michael; Ona; Zonke", "Josefien; Mich…
## $ DeviceId <chr> "{7A4E6639-7387-7648-88EC-7FD27A0F258A}", "{7A4E…
I am now using the View function to look at the entire dataset and glimpse to display a summary of it
I have 20 variables (here columns) and 2795 trials (here rows)
I will now give a brief summary of each variable and its use before creating a new dataframe (df), which I will call Bex, with my variables of interest
The highlighted variables are the ones I will use for Bex. I will then clean the data before heading to the statistical analysis and the interpretation of the results
Date : “Date” is in a POSIXct format which is appropriate for the display of time
Time : “Time” is coded in a POSIXct format
Data : chr “Data” is coded as character
Group : chr The data is coded in r as a character
GPSS : chr “GPSS” is coded as character, although it holds numerical coordinates
GPSE : chr “GPSE” is coded as character, although it holds numerical coordinates
MaleID : chr “MaleID” is coded as character
FemaleID : chr “FemaleID” is coded as character
Male placement corn: dbl “Male placement corn” is coded in r as double
It gives the amount of corn given to the male of the dyad before the trials
Within a session we sometimes gave more placement corn to attract the monkeys back to the boxes. This led to an update of the number within the same session. The number found at the end of the session is the total placement corn an individual has received
I will merge this column with MaleCorn, as the data has been split between these two variables. This is due to a mistake when creating the original box experiment form in cybertracker
This variable could be related to the level of motivation of a monkey, but as it is not directly related to my hypothesis I may not use this column. I will reconsider the use of this column later on
With this possibility in mind, I will change the format of the variable to numerical
MaleCorn : dbl “MaleCorn” is coded in r as double
FemaleCorn : dbl The data is coded in r as double
DyadDistance : chr The data is coded in r as character
DyadResponse : chr The data is coded in r as character
Create a table with each combination existing
Decide what is more important
Ex:
OtherResponse : chr “OtherResponse” is coded as character
Audience : chr “Audience” is coded in r as character
IDIndividual1 : chr “IDIndividual1” is coded in r as character
IntruderID : chr “IntruderID” is coded as character
Remarks : chr The data is coded in r as character
Observers : chr The data is coded in r as character
DeviceId : chr “DeviceId” is coded in r as character
Since I do not want to work with the whole dataset, I will select the variables of interest using the function select
I will keep Time, Date, Group, MaleID, FemaleID, MaleCorn, Male placement corn, FemaleCorn, DyadDistance, DyadResponse, OtherResponse, Audience, IDIndividual1, IntruderID, Remarks
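The selection step can be sketched as below. This is a minimal illustration in base R on an invented two-row data frame (`raw` and its values are hypothetical, and only a few of the 15 kept columns are shown); with dplyr loaded, `select()` with the same column names would play the same role.

```r
# Invented stand-in for the imported dataset; only the column names
# come from the report, the values are made up.
raw <- data.frame(
  Date     = as.Date(c("2022-09-27", "2022-09-27")),
  Time     = c("09:47:50", "09:50:07"),
  Data     = c("Box Experiment", "Box Experiment"),
  Group    = c("Baie Dankie", "Baie Dankie"),
  GPSS     = c("-28.01055", "-28.01055"),
  MaleID   = c("Nge", "Nge"),
  FemaleID = c("Oerw", "Oerw"),
  stringsAsFactors = FALSE
)

# Variables of interest (shortened list for the example).
keep <- c("Time", "Date", "Group", "MaleID", "FemaleID")

# Keep only those columns, in the order given in `keep`.
Bex <- raw[, keep, drop = FALSE]

str(Bex)
```

Listing the kept names in a vector makes it easy to check afterwards that no column was dropped by accident.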
## Rows: 2,795
## Columns: 15
## $ Time <dttm> 1899-12-31 09:47:50, 1899-12-31 09:50:07, 1899-…
## $ Date <dttm> 2022-09-27, 2022-09-27, 2022-09-27, 2022-09-27,…
## $ Group <chr> "Baie Dankie", "Baie Dankie", "Baie Dankie", "Ba…
## $ MaleID <chr> "Nge", "Nge", "Nge", "Nge", "Nge", "Nge", "Nge",…
## $ FemaleID <chr> "Oerw", "Oerw", "Oerw", "Oerw", "Oerw", "Oerw", …
## $ MaleCorn <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
## $ `Male placement corn` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ FemaleCorn <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ DyadDistance <chr> "2m", "2m", "1m", "1m", "0m", "0m", "0m", "0m", …
## $ DyadResponse <chr> "Tolerance", "Tolerance", "Tolerance", "Toleranc…
## $ OtherResponse <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ Audience <chr> "Obse; Oup; Sirk", "Obse; Oup; Sirk", "Oup; Sirk…
## $ IDIndividual1 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ IntruderID <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "Sey…
## $ Remarks <chr> NA, NA, "Nge box did not open because of the bat…
## Number of rows with common NAs in MaleCornOld and 'Male placement corn': 1499
## Number of occurrences of 0 in MaleCorn: 1499
## Number of remaining NA values in MaleCorn: 0
I have found 1499 NA’s in common between MaleCornOld and Male placement corn, 1609 NA’s in Male placement corn and 2685 in MaleCornOld
For the merge of MaleCornOld and Male placement corn, I used two conditions:
1. A new variable MaleCorn is created. If there is a missing value in Male placement corn, it takes the corresponding value from MaleCornOld; otherwise, it takes the value from Male placement corn.
2. If there is no value in either MaleCornOld or Male placement corn (NA, NA) for a given row, MaleCorn is set to 0, as it means that no placement corn was given.
In this way, I should not lose any data, I minimize the mistakes and I already transform the NA’s of this variable that are meant to be 0 into a number
After the merge I found that there were no NA’s remaining in the “New” MaleCorn and that 1499 0’s were found in the column, which corresponds to the number of common NA’s found previously between the “Old” MaleCorn and Male placement corn
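The two merge conditions can be sketched as follows. This is a minimal base R illustration on invented values, not the code used for this report; only the column names MaleCornOld, Male placement corn and MaleCorn come from the text above.

```r
# Toy data; values are invented for illustration.
corn <- data.frame(
  MaleCornOld = c(3, NA, NA, 5),
  placement   = c(NA, 2, NA, 4)
)
names(corn)[2] <- "Male placement corn"

placement <- corn[["Male placement corn"]]

# Rows where both sources are missing (checked before the merge).
both_na <- is.na(corn$MaleCornOld) & is.na(placement)

# Condition 1: take the placement value, fall back to MaleCornOld
# when the placement value is NA.
merged <- ifelse(is.na(placement), corn$MaleCornOld, placement)

# Condition 2: NA in both columns means no placement corn, so use 0.
merged[is.na(merged)] <- 0

corn$MaleCorn <- merged

cat("Rows with NA in both columns:", sum(both_na), "\n")
cat("Zeros in MaleCorn:", sum(corn$MaleCorn == 0), "\n")
cat("Remaining NA in MaleCorn:", sum(is.na(corn$MaleCorn)), "\n")
```

The count of zeros only matches the count of shared NA’s when neither source column contained a 0 before the merge, which is worth checking on the real data.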
## Number of remaining NA values in FemaleCorn: 0
Now, in order to see where the missing points are located in the data, I am going to print the variables with and without NA’s
The function sapply is used to apply a sum of the NA’s to each column of the data frame, that is, to each variable
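A minimal sketch of this NA count with `sapply`, on an invented data frame (`toy` and its values are hypothetical):

```r
# Toy data frame standing in for Bex; values are invented.
toy <- data.frame(
  MaleID   = c("Nge", NA, "Sey"),
  FemaleID = c("Oerw", NA, NA),
  Group    = c("Baie Dankie", "Baie Dankie", "Ankhase"),
  stringsAsFactors = FALSE
)

# Count the NA values in every column; the result is a named vector.
na_counts <- sapply(toy, function(col) sum(is.na(col)))

# Split the variables into those with and without missing data.
with_na    <- na_counts[na_counts > 0]
without_na <- na_counts[na_counts == 0]

print(with_na)
print(without_na)
```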
## Variables with Missing Data:
| x | |
|---|---|
| MaleID | 19 |
| FemaleID | 60 |
| DyadDistance | 33 |
| DyadResponse | 47 |
| OtherResponse | 2758 |
| Audience | 924 |
| IDIndividual1 | 2143 |
| IntruderID | 2737 |
| Remarks | 2181 |
## Variables with No Missing Data:
| x | |
|---|---|
| Time | 0 |
| Date | 0 |
| Group | 0 |
| FemaleCorn | 0 |
| MaleCorn | 0 |
We can see that out of the 14 variables we have in Bex, 9 have missing data: MaleID, FemaleID, DyadDistance, DyadResponse, OtherResponse, Audience, IDIndividual1, IntruderID and Remarks. I will proceed to clean these variables one by one
Before treating the NA’s in the dataset I will make a backup of the data at this point
Since most of the time we did not have any remarks, it is understandable that this variable contains 2181 NA’s out of 2795 rows
I will first transform every missing value in the column Remarks into “No Remarks” and then check the number of “No Remarks” found
After the changes we can see that we have 2181 “No Remarks” and no missing data left in that column. I will treat this column by hand once all the NA’s have been removed from the dataset
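The replacement and the follow-up check can be sketched like this, on an invented vector of remarks (the values are hypothetical):

```r
# Toy Remarks column; values are invented.
remarks <- c(NA, "Box did not open", NA, "Corn leak")

# Number of NA's before the replacement, kept for the check below.
n_na_before <- sum(is.na(remarks))

# Replace every NA with the label "No Remarks".
remarks[is.na(remarks)] <- "No Remarks"

# The number of "No Remarks" should equal the number of NA's replaced,
# provided the label was not already present in the column.
n_no_remarks <- sum(remarks == "No Remarks")
cat("Number of 'No Remarks':", n_no_remarks, "\n")
cat("Remaining NA's:", sum(is.na(remarks)), "\n")

# Two-way count: rows without and with an actual remark.
print(table(ifelse(remarks == "No Remarks", "No Remarks", "Remarks")))
```

The same pattern applies to the other columns where an NA simply means that nothing happened, with a different label for each column.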
## Number of 'No Remarks' in the 'Remarks' column: 2181
##
## No Remarks Remarks
## 2181 614
## Number of 'No Intrusion' in the 'Intruder ID' column after replacement: 2737
## Number of NAs replaced in IDIndividual1: 2143
## Number of remaining NA values in IDIndividual1: 0
## Number of changes made in 'Audience': 924
## Remaining NA values in 'Audience': 0
## Number of changes made in 'OtherResponse': 2758
## Remaining NA values in 'OtherResponse': 0
## [1] "1899-12-31 09:47:50 UTC" "1899-12-31 09:50:07 UTC"
## [3] "1899-12-31 09:53:11 UTC" "1899-12-31 09:54:28 UTC"
## [5] "1899-12-31 09:55:19 UTC" "1899-12-31 09:56:56 UTC"
## [1] "09:47:50" "09:50:07" "09:53:11" "09:54:28" "09:55:19" "09:56:56"
## Warning: NAs introduced by coercion
## # A tibble: 69 × 16
## Time Date Group MaleID FemaleID FemaleCorn DyadDistance
## <chr> <dttm> <chr> <chr> <chr> <dbl> <dbl>
## 1 12:09:34 2022-09-27 00:00:00 Baie Da… Xia Piep 7 NA
## 2 12:13:28 2022-09-27 00:00:00 Baie Da… Xia Piep 7 NA
## 3 16:02:32 2022-09-15 00:00:00 Ankhase Sho Ginq 6 NA
## 4 10:46:33 2023-08-17 00:00:00 Baie Da… Xia Piep 0 NA
## 5 09:30:17 2023-07-29 00:00:00 Baie Da… Xin Ouli 0 NA
## 6 12:08:51 2023-07-11 00:00:00 Baie Da… Xia Piep 0 NA
## 7 13:30:07 2023-06-29 00:00:00 Baie Da… Sey Sirk 0 NA
## 8 09:54:24 2023-06-27 00:00:00 Ankhase Sho Ginq 0 NA
## 9 10:13:56 2023-06-23 00:00:00 Ankhase Sho Ginq 0 NA
## 10 09:39:04 2023-06-15 00:00:00 Ankhase Sho Ginq 2 NA
## # ℹ 59 more rows
## # ℹ 9 more variables: DyadResponse <chr>, OtherResponse <chr>, Audience <chr>,
## # IDIndividual1 <chr>, IntruderID <chr>, Remarks <chr>, MaleCorn <dbl>,
## # Intrusion <dbl>, AmountAudience <dbl>
## Number of NA values in DyadDistance column (using second approach): 69
## Rows with NA values in DyadDistance column: 24, 27, 95, 492, 744, 971, 1113, 1130, 1164, 1261, 1341, 1396, 1491, 1583, 1683, 1693, 1717, 1718, 1719, 1724, 1725, 1739, 1755, 1756, 1757, 1764, 1779, 1782, 1792, 1799, 1800, 1840, 1841, 1868, 1869, 1888, 1891, 1892, 1896, 1911, 1912, 1915, 1918, 1919, 1952, 1953, 1958, 1980, 1981, 1984, 1986, 1996, 2000, 2009, 2054, 2104, 2105, 2191, 2233, 2234, 2287, 2437, 2569, 2579, 2580, 2643, 2676, 2709, 2729
We have 69 missing values in DyadDistance. I will look at each row in its context, as the actual distance of the box was always dependent on the previous trials. I will start with the highest row numbers, as for now the oldest trial is in the last row while the most recent one is in row 1.
Now that I have looked at each missing line and seen which ones to keep, I have decided to create a new variable called Distance. I will also create a new variable called No trial.
For the variable Distance I will replace the missing data in each row with a value, and I will delete the rows where no value could be assigned. This will allow me to have no missing data and to have a distance for each trial that has been done
Before making the changes I will make a backup called BackupbeforeDistanceNA
## Number of NA's in DyadDistance after replacements and deletions: 1
## Data size after deletions: 2748
## Row index with NA in DyadDistance: 1925
*It seems that row 1925 still has an NA in DyadDistance
## Row index with NA in DyadDistance:
*In this modification, I added a check to see if the columns DyadDistance and Distance already exist in the dataframe (Bex). If they do, it prints a message saying that the modification has already been applied, and no changes are made. If they don’t exist, it proceeds with the modifications. This way, running the code multiple times won’t cause redundant changes.
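The guard described above can be sketched as follows. This is an illustration only: `add_distance` is a hypothetical helper, it checks for the new column Distance alone, and the toy values are invented.

```r
# Hypothetical helper: adds Distance only if it is not there yet.
add_distance <- function(df) {
  if ("Distance" %in% names(df)) {
    message("Modification already applied; no changes made.")
    return(df)
  }
  # First run: create the new column from DyadDistance.
  df$Distance <- df$DyadDistance
  df
}

# Toy data; values are invented.
toy <- data.frame(DyadDistance = c(2, 1, 0))

once  <- add_distance(toy)   # creates Distance
twice <- add_distance(once)  # prints the message, changes nothing

print(identical(once, twice))
```

Because the second call returns the data unchanged, the chunk can be re-run safely while working through the notebook.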
Before cleaning Female and Male ID, here is a list of every dyad of the box experiment and their respective groups. This will help us find the missing names when only one individual is missing out of the duo (either male or female):
Sirk & Sey - BD
Ouli & Xin - BD
Piep & Xia - BD
Oerw & Nge - BD
Oort & Kom - BD
Ginq & Sho - AK
Ndaw & Buk - AK
Xian & Pom - AK
Guat & Pom - AK
Note that the 4-letter codes correspond to the FemaleID, the 3-letter codes to the MaleID and the 2-letter codes to the group name of the monkeys
I need to check where the NA’s are in both FemaleID and MaleID by looking at the rows where data is missing. Since every trial was made with a dyad and never with a single individual, treating these two columns together makes more sense. If both individuals are missing I may have to delete the row.
## Row numbers with missing values in FemaleID: 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1808 1809 1810 1811 1812 1813 1814 1815 1816 1817 1818 1819 1820 1884 1885 2619 2620 2621 2622 2623 2624 2625 2626 2627 2628 2629
## Number of missing values in FemaleID: 59
## Row numbers with missing values in MaleID: 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710
## Number of missing values in MaleID: 18
## Number of rows with missing values in both FemaleID and MaleID: 18
## Row numbers with missing values in both FemaleID and MaleID: 1693, 1694, 1695, 1696, 1697, 1698, 1699, 1700, 1701, 1702, 1703, 1704, 1705, 1706, 1707, 1708, 1709, 1710
## Number of missing values in FemaleID not in MaleID: 41
## Row numbers with missing values in FemaleID not in MaleID: 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 1808, 1809, 1810, 1811, 1812, 1813, 1814, 1815, 1816, 1817, 1818, 1819, 1820, 1884, 1885, 2619, 2620, 2621, 2622, 2623, 2624, 2625, 2626, 2627, 2628, 2629
FemaleID has 59 NA’s, 41 of which are in rows where MaleID is present, while there are 18 NA’s in MaleID
Among these missing data, 18 NA’s are in common between FemaleID and MaleID, which represents the totality of the missing values in MaleID
All the missing data in MaleID are found in consecutive rows, from row 1693 to row 1710, and are from the group Noha (NH) on the 19th of April 2023. Other trials were made on the same day. Looking at the time of the experiment, the previous trials and the audience, we can assess that the individuals involved were Xian for FemaleID and Pom for MaleID. I will thus replace these values using a condition. These NA’s in Noha (rows 1693 to 1710) are the only NA’s that MaleID has and the only NA’s of FemaleID in Noha. I will thus replace every NA of MaleID in Noha with Pom and every NA of FemaleID in Noha with Xian
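A minimal sketch of this conditional replacement, on invented rows (the group label "Noha" is taken from the text above; the exact coding of the Group column in the real data is an assumption):

```r
# Toy data; values are invented.
toy <- data.frame(
  Group    = c("Noha", "Noha", "Ankhase", "Noha"),
  MaleID   = c(NA, "Pom", "Buk", NA),
  FemaleID = c(NA, "Xian", NA, NA),
  stringsAsFactors = FALSE
)

# Condition: the row belongs to Noha (Group itself has no NA's).
in_noha <- toy$Group == "Noha"

# Replace only the NA's that fall inside Noha.
toy$MaleID[in_noha & is.na(toy$MaleID)]     <- "Pom"
toy$FemaleID[in_noha & is.na(toy$FemaleID)] <- "Xian"

cat("Remaining NA in MaleID:", sum(is.na(toy$MaleID)), "\n")
cat("Remaining NA in FemaleID:", sum(is.na(toy$FemaleID)), "\n")
```

The FemaleID NA outside Noha is left untouched on purpose, so it can be handled in the next step.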
## Number of remaining NA values in MaleID after replacement: 0
## Number of remaining NA values in FemaleID after replacement: 41
## Number of rows with missing values in both MaleID and FemaleID after replacement: 0
In order to clean FemaleID, I will use the data from the now complete MaleID. I will use conditions stating that, when there is an NA in FemaleID, the name found in MaleID determines which name replaces the NA
Before automating the process I will check the data manually to see if there are any exceptions or mistakes
## Rows with missing values in FemaleID:
## # A tibble: 41 × 16
## Time Date Group MaleID FemaleID FemaleCorn DyadDistance
## <chr> <dttm> <chr> <chr> <chr> <dbl> <dbl>
## 1 09:31:55 2023-07-22 00:00:00 Ankhase Buk <NA> 7 1
## 2 09:33:14 2023-07-22 00:00:00 Ankhase Buk <NA> 7 1
## 3 09:34:07 2023-07-22 00:00:00 Ankhase Buk <NA> 7 0
## 4 09:34:51 2023-07-22 00:00:00 Ankhase Buk <NA> 7 0
## 5 09:36:59 2023-07-22 00:00:00 Ankhase Buk <NA> 7 0
## 6 09:38:13 2023-07-22 00:00:00 Ankhase Buk <NA> 7 1
## 7 09:39:26 2023-07-22 00:00:00 Ankhase Buk <NA> 7 0
## 8 09:41:11 2023-07-22 00:00:00 Ankhase Buk <NA> 0 0
## 9 09:42:17 2023-07-22 00:00:00 Ankhase Buk <NA> 0 0
## 10 09:44:06 2023-07-22 00:00:00 Ankhase Buk <NA> 0 1
## # ℹ 31 more rows
## # ℹ 9 more variables: DyadResponse <chr>, OtherResponse <chr>, Audience <chr>,
## # IDIndividual1 <chr>, IntruderID <chr>, Remarks <chr>, MaleCorn <dbl>,
## # Intrusion <dbl>, AmountAudience <dbl>
If there is an NA in FemaleID, we will replace the value with:
- Sirk if MaleID is Sey
- Ouli if MaleID is Xin
- Piep if MaleID is Xia
- Oerw if MaleID is Nge
- Oort if MaleID is Kom
- Ginq if MaleID is Sho
- Ndaw if MaleID is Buk
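The rules above can be written as a lookup table. The sketch below uses a named vector in base R on invented rows; the `partner` vector is my own construction from the list above.

```r
# Male -> female partner, following the replacement rules.
partner <- c(Sey = "Sirk", Xin = "Ouli", Xia = "Piep", Nge = "Oerw",
             Kom = "Oort", Sho = "Ginq", Buk = "Ndaw")

# Toy data; values are invented.
toy <- data.frame(
  MaleID   = c("Buk", "Sey", "Pom", "Kom"),
  FemaleID = c(NA, "Sirk", NA, NA),
  stringsAsFactors = FALSE
)

# Fill only the rows where FemaleID is missing.
missing <- is.na(toy$FemaleID)
toy$FemaleID[missing] <- unname(partner[toy$MaleID[missing]])

print(toy)
```

A male that is not in the lookup (here Pom, who has two possible partners in the dyad list) stays NA and has to be resolved by hand.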
## # A tibble: 20 × 3
## MaleID FemaleID Count
## <chr> <chr> <int>
## 1 Xia Piep 576
## 2 Sey Sirk 557
## 3 Kom Oort 338
## 4 Sho Ginq 278
## 5 Pom Xian 259
## 6 Buk Ndaw 245
## 7 Xin Ouli 159
## 8 Nge Oerw 153
## 9 Piep Xia 35
## 10 Oort Kom 29
## 11 Ouli Xin 27
## 12 Oerw Nge 19
## 13 Sirk Sey 17
## 14 Buk <NA> 15
## 15 Sey <NA> 13
## 16 Nge <NA> 11
## 17 Buk Ginq 6
## 18 Pom Guat 5
## 19 Xin Oort 4
## 20 Kom <NA> 2
## Number of NA values in MaleID: 0
## Number of NA values in FemaleID: 0
## Rows with missing values in DyadResponse: 871, 1163, 1219, 1339, 1579, 1888, 1962
## Lines with missing values in DyadResponse:
## # A tibble: 7 × 16
## Time Date Group MaleID FemaleID FemaleCorn DyadDistance
## <chr> <dttm> <chr> <chr> <chr> <dbl> <dbl>
## 1 09:39:26 2023-07-22 00:00:00 Ankhase Buk Ndaw 7 0
## 2 10:13:56 2023-06-23 00:00:00 Ankhase Sho Ginq 0 4
## 3 08:34:45 2023-06-17 00:00:00 Baie Dan… Kom Oort 3 2
## 4 08:54:12 2023-06-09 00:00:00 Baie Dan… Xia Piep 1 0
## 5 13:35:08 2023-05-03 00:00:00 Baie Dan… Kom Oort 5 3
## 6 13:27:30 2023-01-18 00:00:00 Ankhase Buk Ndaw 5 4
## 7 08:36:49 2022-12-13 00:00:00 Baie Dan… Kom Oort 8 4
## # ℹ 9 more variables: DyadResponse <chr>, OtherResponse <chr>, Audience <chr>,
## # IDIndividual1 <chr>, IntruderID <chr>, Remarks <chr>, MaleCorn <dbl>,
## # Intrusion <dbl>, AmountAudience <dbl>
Row 871: The previous row was tolerance at 1m and the next tolerance at 0 which means that the row 871 should be Tolerance for DyadResponse
Row 1163: The value can not be found from the other rows so I will delete row 1163
Row 1219: The previous row was not approaching at 2m and the next is tolerance at 2m and tolerance at 1m, which means that the row 1219 should be Tolerance for DyadResponse
Row 1339: The previous row was tolerance at 0m and the next one was also tolerance at 0m, which means that row 1339 should be Tolerance for DyadResponse
Row 1579: The value can not be found from the other rows so I will delete row 1579
Row 1888: The value can not be found from the other rows so I will delete row 1888
Row 1962: The value can not be found from the other rows so I will delete row 1962
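The row-by-row decisions above amount to filling some positions and dropping others. A sketch on invented rows (the positions in `fill_rows` and `drop_rows` are hypothetical stand-ins for the row numbers listed above):

```r
# Toy data; values are invented.
toy <- data.frame(
  DyadDistance = c(1, 0, 0, 4, 2),
  DyadResponse = c("Tolerance", NA, "Tolerance", NA, "Not approaching"),
  stringsAsFactors = FALSE
)

fill_rows <- c(2)  # context shows the response was Tolerance
drop_rows <- c(4)  # response cannot be recovered

# Fill first, while the original row numbers are still valid.
toy$DyadResponse[fill_rows] <- "Tolerance"

# Then delete; %in% is safe even when drop_rows is empty,
# unlike toy[-drop_rows, ], which would return zero rows.
toy <- toy[!(seq_len(nrow(toy)) %in% drop_rows), ]
rownames(toy) <- NULL

cat("Remaining NA in DyadResponse:", sum(is.na(toy$DyadResponse)), "\n")
```

Doing all the deletions in one step at the end avoids the row numbers shifting between two corrections.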
## Number of remaining NA values in DyadResponse: 0
## Final check of NA values in Bex:
## Time Date Group MaleID FemaleID
## 0 0 0 0 0
## FemaleCorn DyadDistance DyadResponse OtherResponse Audience
## 0 0 0 0 0
## IDIndividual1 IntruderID Remarks MaleCorn Intrusion
## 0 0 0 0 0
## AmountAudience
## 0
Since I have removed all the missing data from the different columns, I now have to correct the potential mistakes that can be found and create new variables to be able to manipulate my data better.
Since the column Remarks contains corrections and additional information, I will treat it now
Before that, let’s check how many remarks we have in our dataset and how many of the main keywords we can find, and make a visual representation of it
## Number of 'No Remarks' entries: 2139
## Number of actual remarks entries: 603
## Total number of keyword occurrences in the Barplot: 822
## Glimpse of the Bex Before treating Remarks:
## Rows: 2,742
## Columns: 16
## $ Time <chr> "09:47:50", "09:50:07", "09:53:11", "09:54:28", "09:55:…
## $ Date <dttm> 2022-09-27, 2022-09-27, 2022-09-27, 2022-09-27, 2022-0…
## $ Group <chr> "Baie Dankie", "Baie Dankie", "Baie Dankie", "Baie Dank…
## $ MaleID <chr> "Nge", "Nge", "Nge", "Nge", "Nge", "Nge", "Nge", "Nge",…
## $ FemaleID <chr> "Oerw", "Oerw", "Oerw", "Oerw", "Oerw", "Oerw", "Oerw",…
## $ FemaleCorn <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 7, 7, 7…
## $ DyadDistance <dbl> 2, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 1, 1, 0, 0…
## $ DyadResponse <chr> "Tolerance", "Tolerance", "Tolerance", "Tolerance", "To…
## $ OtherResponse <chr> "No Response", "No Response", "No Response", "No Respon…
## $ Audience <chr> "Obse; Oup; Sirk", "Obse; Oup; Sirk", "Oup; Sirk", "Sir…
## $ IDIndividual1 <chr> "No individual", "No individual", "No individual", "No …
## $ IntruderID <chr> "No Intrusion", "No Intrusion", "No Intrusion", "No Int…
## $ Remarks <chr> "No Remarks", "No Remarks", "Nge box did not open becau…
## $ MaleCorn <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
## $ Intrusion <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0…
## $ AmountAudience <dbl> 3, 3, 2, 1, 2, 2, 2, 1, 1, 2, 6, 6, 3, 2, 2, 2, 2, 2, 2…
## [1] "/Users/maki/Desktop/Master Thesis/BEX 2223 Master Thesis Maung Kyaw/IVPToleranceBex2223"
Before treating all the data in the Remarks I will create a few columns to redistribute information
Also, whenever I have treated a remark, I will replace it with “Treated”, and if I have to delete the row I will write “Delete”. After re-importing the data I will count these changes to check that I still have the correct number of cells and that the changes have been made
a.Context: BoxMalfunction, BoxOpenedBefore, NoExperiment, Agonistic, Guat;Ap;Xian, CornLeak, BetweenGroupEncounter, ContactCalling,
b.SpecialBehaviour Oerw;Vo;Exp, Sey;Ap;AfterOpen, Oerw;Vo;Exp,Nge;Vo;Exp, Sirk;ApAfter30, Sirk;Av;Oerw, Oerw;Lo,Sey;Sf;Oort,Oort;At;Kom, Kom;Ap;AfterOpen, Sey;Ch,Sirk, Xin;Hesitation. Xia;Sf;Piep, Pom;Sf;Xian,Kom;Sf;Oort, Sey;Sf;Sirk, Xia;Sf;Piep,Piep;Sf,XIa, Oort;At;Kom, Sey;Rt;Sho;Ap, Sho;Rt;Ginq;Ap, Buk;Sf;Ndaw, Sho;Rt;Ndaw;Ap, Oort;Sf;Kom, Ginq;Sho;Ap;After30, Ndaw;Sc,Buk;Sf, Ndaw;Ap;After30, Kom;Ap;After30, Xia;Piep;Ap;After30, Pom;Bi;Xian, Sho;Ndaw;Av;Buk, Kom;Sf;Oort, Kom;St;Oort,Oort;St;Kom, Sey;Hi;Sirk, Obse;Ap;Piep;Av,Piep;Sf;Xia, Sirk;ApWhenPartnerLeft, Sey;Hh;Sirk, Xia;Sf;Piep;Sc, Xia;Piep;ShareFood, Piep;Ap;After30,Xia;Mu;Piep, Oort;St;Sirk;Ja,Sey;Sf, Pom;Sf;Xian, Ndaw;ApWhenPartnerLeft, Xian;At;Pom,Gaya;Su, Xian;Sf;Pom, Xian;Hesitation, Xia;ApWhenPartnerLeft,Sirk;Hesitation, Ginq;Hesitation, Sey;Ap;Kom;Av, Oort;Sc;Kom, Xian; Pom, Pom;Ap;Xian, Pom;Ap;Xian,Xian;Rt, Sey;Ap;Sirk;Rt, Sey;St;Sirk;Ig, Xia;Asf;Piep, Piep;ApWhenPartnerLeft, Sho;Ap;After30, Ginq;ApWhenPartnerLeft, Pom;Sf;Xian;Sf;Pom, Xian;ApWhenPartnerLeft, Piep;Ch;Sirk, Sey;St;Sirk, Ndaw;Ap;After30, Xian;Ap;After30, Xian;St;Prai, Pom;Sf;Xian;Vc, Kom;Ap;After30, Kom;ApproachWithPartner, Oort;ApWhenPartnerLeft, Sho;Ap;After30,Ginq;Ap;After30, Ginq;ApproachWithPartner, Ndaw;Hesitation, Oerw;Hesitation, Oerw;ApWhenPartnerLeft. Piep;Ap;After30, Sirk;Ap;After30, Xia;Ap;After30, Ouli;Gr;BBOuli, Oerw;Ap;After30, Sirk;Hesitation, Sey;Ap;Sirk;Av, Ouli;Ap;Xia;Av, Xin;Ap;After30, Sho;Sf;Ginq;Sc, Xia;ApWhenPartnerLeft, Sey;Ap;Sirk;Ja, Nge;Oerw;ShareFood, Nge;Ap;Oerw;Oerw;At,Obse;At;Nge,
c.GotCorn: No;Nge, No;Piep, No;Xian, No;Oort, No;Sirk, No;Kom, No;Ndaw, No;Kom, No;Oort, No;Xia, No;Buk, No;Sho, No;Sey,No;Piep, No;Ginq
IntruderID: Sey, Oerw, Guat, Kom, Gris, Sho, Oerw; Ouli, Guat; Gri, Xop, Obse, Oort, Obse; Sey, Ginq; Ghid, Xia, Grif, Sey, Gree; Gran, Godu; Gub, Gran, Oerw; Nak, Ghid, Buk, Oup
DyadDistance: 6, 7, 8, 9 , 1
Audience: UnidentifiedAudience, Ouli; Riss, Gris, Sey, Sey; Piep; Sirk, Oup Ome
IDIndividual1: Piep, Oort; Kom, Ndaw; Buk, Sho; Ginq, Ndaw, Buk, Xian, Pom, Oort; Kom, Buk; Ndaw, Sirk; Sey, Xin; Ouli, Oerw; Nge
DyadResponse: Tolerance, Not approaching; Losing interest, Losing interest; Intrusion
## Rows: 2,742
## Columns: 19
## $ Time <chr> "09:47:50", "09:50:07", "09:53:11", "09:54:28", "09:5…
## $ Date <dttm> 2022-09-27, 2022-09-27, 2022-09-27, 2022-09-27, 2022…
## $ Group <chr> "Baie Dankie", "Baie Dankie", "Baie Dankie", "Baie Da…
## $ MaleID <chr> "Nge", "Nge", "Nge", "Nge", "Nge", "Nge", "Nge", "Nge…
## $ FemaleID <chr> "Oerw", "Oerw", "Oerw", "Oerw", "Oerw", "Oerw", "Oerw…
## $ FemaleCorn <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 7, 7,…
## $ DyadDistance <dbl> 2, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 1, 1, 0,…
## $ DyadResponse <chr> "Tolerance", "Tolerance", "Tolerance", "Tolerance", "…
## $ OtherResponse <chr> "No Response", "No Response", "No Response", "No Resp…
## $ Audience <chr> "Obse; Oup; Sirk", "Obse; Oup; Sirk", "Oup; Sirk", "S…
## $ IDIndividual1 <chr> "No individual", "No individual", "No individual", "N…
## $ IntruderID <chr> "No Intrusion", "No Intrusion", "No Intrusion", "No I…
## $ Remarks <chr> "No Remarks", "No Remarks", "Treated", "Treated", "No…
## $ MaleCorn <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,…
## $ Intrusion <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
## $ AmountAudience <dbl> 3, 3, 2, 1, 2, 2, 2, 1, 1, 2, 6, 6, 3, 2, 2, 2, 2, 2,…
## $ Context <chr> "NoContext", "NoContext", "BoxMalfunction", "BoxOpene…
## $ SpecialBehaviour <chr> "NoSpecialBehaviour", "NoSpecialBehaviour", "Oerw;Vo;…
## $ GotCorn <chr> "Yes", "Yes", "No;Nge", "Yes", "Yes", "Yes", "Yes", "…
## Number of NA entries in Context: 0
## Number of NA entries in SpecialBehaviour: 0
## Number of NA entries in GotCorn: 0
## Number of NA entries in BexClean: 0
Time : I considered looking at the time sections in which we did the experiment. I will thus look at the time range (earliest and latest time in the day) before separating the day into different sections to get an idea of the part of the day in which most of the experiments occurred. This will not be used in my analysis, but if I wanted to, it could be interesting to compare the number of experiments made per day and add a line indicating the time of sunrise.
The minimum Time in the dataset is 06:03:26 while the maximum Time is 16:36:59
In my box experiment I have this variable called Time that tells me when the experiment was done. I don’t think I need this information per se. I was wondering if it could be easy and interesting to see from when to when the trials occur and then separate this time into a few sections like early morning, morning, midday, afternoon, end of the day
a. 6 to 8: Early morning
b. 8 to 10: Morning
c. 10 to 12: Noon
d. 12 to 14: Afternoon
e. 14 to 17: End of the day
Last, I want to create a variable called Hour that will take the value in Time and round it down to the hour it falls in, e.g. from 06:00 to 06:59 -> 6, from 07:00 to 07:59 -> 7, etc.
This will allow me to see in more detail when the trials occurred and in which hour most of them happened. Nevertheless, Period will be better for readability
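Hour and Period can be derived from the Time strings as sketched below. This is an illustration on invented times, assuming Time is stored as "HH:MM:SS" text and that each section includes its lower bound and excludes its upper bound (so 12:00:00 counts as Afternoon).

```r
# Toy Time values in "HH:MM:SS" format; values are invented
# except for the earliest and latest times reported above.
time_chr <- c("06:03:26", "09:47:50", "11:59:59", "12:00:00", "16:36:59")

# Hour: the first two characters, e.g. "06:03:26" -> 6.
Hour <- as.integer(substr(time_chr, 1, 2))

# Period: sections [6,8), [8,10), [10,12), [12,14), [14,17).
Period <- cut(
  Hour,
  breaks = c(6, 8, 10, 12, 14, 17),
  labels = c("Early morning", "Morning", "Noon",
             "Afternoon", "End of the day"),
  right  = FALSE
)

print(data.frame(Time = time_chr, Hour = Hour, Period = Period))
print(table(Period))
```

A time outside 06:00 to 16:59 would get an NA Period, which acts as a flag for unexpected values.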
## Unique Female IDs: Sirk Ginq Piep Oerw Xin Ndaw Xia Sey Ouli Nge Oort Xian Guat Kom
## Unique Male IDs: Sey Sho Xia Nge Ouli Buk Piep Sirk Xin Oerw Kom Pom Oort
| Buk | Kom | Nge | Oerw | Oort | Ouli | Piep | Pom | Sey | Sho | Sirk | Xia | Xin | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ginq | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 277 | 0 | 0 | 0 |
| Guat | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 |
| Kom | 0 | 0 | 0 | 0 | 29 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Ndaw | 259 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Nge | 0 | 0 | 0 | 19 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Oerw | 0 | 0 | 164 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Oort | 0 | 337 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 |
| Ouli | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 159 |
| Piep | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 575 | 0 |
| Sey | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 17 | 0 | 0 |
| Sirk | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 570 | 0 | 0 | 0 | 0 |
| Xia | 0 | 0 | 0 | 0 | 0 | 0 | 35 | 0 | 0 | 0 | 0 | 0 | 0 |
| Xian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 259 | 0 | 0 | 0 | 0 | 0 |
| Xin | 0 | 0 | 0 | 0 | 0 | 27 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
## Unique Dyads: Sey Sirk Sho Ginq Xia Piep Nge Oerw Xin Ouli Buk Ndaw Buk Ginq Kom Oort Pom Xian Pom Guat Xin Oort
| Var1 | Freq |
|---|---|
| Buk Ginq | 6 |
| Buk Ndaw | 259 |
| Kom Oort | 366 |
| Nge Oerw | 183 |
| Pom Guat | 5 |
| Pom Xian | 259 |
| Sey Sirk | 587 |
| Sho Ginq | 277 |
| Xia Piep | 610 |
| Xin Oort | 4 |
| Xin Ouli | 186 |
## Unique Male-Female Combinations:
## # A tibble: 11 × 2
## Male Female
## <chr> <chr>
## 1 Sey Sirk
## 2 Sho Ginq
## 3 Xia Piep
## 4 Nge Oerw
## 5 Xin Ouli
## 6 Buk Ndaw
## 7 Buk Ginq
## 8 Kom Oort
## 9 Pom Xian
## 10 Pom Guat
## 11 Xin Oort
## [1] "Wrong Rows:"
## [1] 613 614 615 616 617 931 2710 2711 2712 2713
## [1] "Wrong Dyads:"
## [1] "Buk Ginq" "Buk Ginq" "Buk Ginq" "Buk Ginq" "Buk Ginq" "Buk Ginq"
## [7] "Xin Oort" "Xin Oort" "Xin Oort" "Xin Oort"
There are 10 wrong dyads that I will have to identify in the dataset and manually correct:
- Buk Ginq: 6 occurrences
- Xin Oort: 4 occurrences
I will change the occurrences of Buk Ginq to Sho Ginq for rows 613 to 617 and row 931. I know these trials are with Sho Ginq because the comments mentioned Sho in them while MaleID gave Buk, which was a mistake
For rows 2710 to 2713, since Ouli is in the audience, it is unlikely that we had trials with the dyad Xin Ouli. I also think there is little chance that the names of both individuals were entered wrong. I will replace these occurrences of Xin Oort with Kom Oort
I thus want Buk to be replaced with Sho in MaleID in rows 613 to 617 and row 931, and Xin to be replaced with Kom in MaleID in rows 2710 to 2713, before updating Dyad
If Rows 613 to 617 and 931 are coded with Buk for MaleId and Ginq for FemaleId replace Male by Sho
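The correction can be sketched as follows on invented rows. The positions in `rows_sho` and `rows_kom` are hypothetical stand-ins for rows 613 to 617, 931 and 2710 to 2713.

```r
# Toy data; values are invented.
toy <- data.frame(
  MaleID   = c("Buk", "Buk", "Xin", "Xin"),
  FemaleID = c("Ginq", "Ndaw", "Oort", "Ouli"),
  stringsAsFactors = FALSE
)

rows_sho <- c(1)  # rows where Buk Ginq should be Sho Ginq
rows_kom <- c(3)  # rows where Xin Oort should be Kom Oort

idx <- seq_len(nrow(toy))

# Only change a row if it is in the list AND still holds the wrong dyad.
fix_sho <- idx %in% rows_sho & toy$MaleID == "Buk" & toy$FemaleID == "Ginq"
fix_kom <- idx %in% rows_kom & toy$MaleID == "Xin" & toy$FemaleID == "Oort"

toy$MaleID[fix_sho] <- "Sho"
toy$MaleID[fix_kom] <- "Kom"

# Rebuild Dyad after the correction.
toy$Dyad <- paste(toy$MaleID, toy$FemaleID)

cat("Number of rows changed to Sho Ginq:", sum(fix_sho), "\n")
cat("Number of rows changed to Kom Oort:", sum(fix_kom), "\n")
```

Checking both the row number and the current names means that re-running the chunk, or running it after rows have shifted, cannot overwrite a correct dyad.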
## Rows to correct Sho Ginq:
## [1] 613 614 615 616 617 931
## Rows to correct Kom Oort:
## [1] 2710 2711 2712 2713
## Wrong Rows After Correction:
## integer(0)
## Wrong Dyads After Correction:
## character(0)
## All dyads are now correct.
## Unique Dyads after correction: Sey Sirk Sho Ginq Xia Piep Nge Oerw Xin Ouli Buk Ndaw Kom Oort Pom Xian Pom Guat
| Var1 | Freq |
|---|---|
| Buk Ndaw | 259 |
| Kom Oort | 370 |
| Nge Oerw | 183 |
| Pom Guat | 5 |
| Pom Xian | 259 |
| Sey Sirk | 587 |
| Sho Ginq | 283 |
| Xia Piep | 610 |
| Xin Ouli | 186 |
## Number of rows changed to Sho Ginq: 6
## Number of rows changed to Kom Oort: 4
Create the variable called Trial, where the data will be sorted by date and dyad in order to see how many trials have been done with each individual: one row (per dyad) = one trial.
Create the variable called Day, where the data will be sorted by date and dyad in order to see how many sessions have been done with each individual: one day (per dyad) = one session.
Now, let’s proceed with creating the Dyad variable, Trial, and Day:
Make a summary of trials and sessions so I can see how many trials and sessions have been done with the individuals
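One possible reading of these two variables is sketched below on invented rows: Trial as a running count of rows within a dyad, and Day as a running count of distinct dates within a dyad. Both definitions are my assumptions and may differ from how Trial and Day were actually computed for this report.

```r
# Toy data; values are invented.
toy <- data.frame(
  Date = as.Date(c("2022-09-27", "2022-09-27", "2022-09-28",
                   "2022-09-27", "2022-09-29")),
  Time = c("09:50:07", "09:47:50", "10:00:00", "12:09:34", "08:00:00"),
  Dyad = c("Nge Oerw", "Nge Oerw", "Nge Oerw", "Xia Piep", "Xia Piep"),
  stringsAsFactors = FALSE
)

# Sort by dyad, then chronologically.
toy <- toy[order(toy$Dyad, toy$Date, toy$Time), ]
rownames(toy) <- NULL

# Trial: 1, 2, 3, ... within each dyad (one row = one trial).
toy$Trial <- ave(seq_len(nrow(toy)), toy$Dyad, FUN = seq_along)

# Day: index of the date among the dyad's distinct dates
# (one day = one session).
toy$Day <- ave(as.numeric(toy$Date), toy$Dyad,
               FUN = function(d) match(d, sort(unique(d))))

# Summaries: number of trials and of sessions per dyad.
trial_summary <- aggregate(Trial ~ Dyad, data = toy, FUN = max)
day_summary   <- aggregate(Day ~ Dyad, data = toy, FUN = max)

print(trial_summary)
print(day_summary)
```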
## Trial Summary:
| Dyad | Amount of Trials |
|---|---|
| Xia Piep | 610 |
| Sey Sirk | 587 |
| Kom Oort | 370 |
| Sho Ginq | 283 |
| Buk Ndaw | 259 |
| Pom Xian | 259 |
| Xin Ouli | 186 |
| Nge Oerw | 183 |
| Pom Guat | 5 |
##
## Day Summary:
| Dyad | Number of Days |
|---|---|
| Xin Ouli | 276 |
| Xia Piep | 249 |
| Sho Ginq | 200 |
| Sey Sirk | 165 |
| Pom Xian | 112 |
| Pom Guat | 93 |
| Nge Oerw | 92 |
| Kom Oort | 70 |
| Buk Ndaw | 39 |
## Change in Rows: -5
## Placement columns were created successfully.
## Maximum Distance: 10
## Minimum Distance: 0
Reminder: The different behaviors that are coded in DyadResponse are: Distracted, Female aggress male, Male aggress female, Intrusion, Losing interest, Not approaching, Tolerance and Other
I will create some tables to have a better understanding of the state of the column DyadResponse and the different existing combinations at this point
I will also create the hierarchy before implementing it in the dataset
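Counting the rows with several responses and tabulating the combinations can be sketched like this. The example assumes that multiple entries are separated by ";" and uses invented values; it does not implement the hierarchy itself.

```r
# Toy DyadResponse column; values are invented.
resp <- c("Tolerance", "Tolerance; Intrusion", "Not approaching",
          "Intrusion;Tolerance", "Losing interest; Not approaching")

# Split on ";", trim spaces and sort, so that "Tolerance; Intrusion"
# and "Intrusion;Tolerance" count as the same combination.
parts <- strsplit(resp, ";", fixed = TRUE)
parts <- lapply(parts, function(p) sort(trimws(p)))

# Rows with more than one entry.
n_entries  <- lengths(parts)
multi_rows <- which(n_entries > 1)

cat("Rows with multiple entries:", length(multi_rows), "\n")
cat("Rows with a single entry:", sum(n_entries == 1), "\n")

# Frequency of each unique combination, most frequent first.
combos <- vapply(parts[multi_rows], paste, character(1), collapse = ";")
combo_table <- sort(table(combos), decreasing = TRUE)
print(combo_table)
```

Sorting the entries inside each row before pasting them back together is what keeps the combination table free of duplicates that differ only in order.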
## Number of rows with multiple entries in DyadResponse: 271
## Rows with multiple entries in DyadResponse: 3 4 13 53 58 64 68 71 72 84 88 89 97 98 113 133 137 192 194 244 275 277 278 284 289 298 302 304 318 322 326 327 328 331 341 346 347 367 368 387 388 391 397 409 454 468 469 482 497 510 511 522 529 536 597 600 626 629 687 696 706 711 713 726 740 748 749 757 760 761 763 768 769 780 786 818 829 841 843 845 861 866 877 888 892 898 912 919 934 937 938 954 955 962 973 997 1004 1015 1027 1042 1048 1060 1108 1141 1143 1161 1162 1190 1209 1213 1217 1225 1229 1236 1240 1243 1245 1247 1254 1267 1271 1284 1288 1293 1308 1310 1311 1326 1327 1330 1336 1341 1343 1352 1378 1398 1399 1407 1416 1461 1464 1473 1478 1484 1488 1489 1495 1510 1520 1521 1528 1558 1595 1598 1602 1607 1616 1629 1630 1633 1636 1639 1642 1645 1654 1658 1679 1708 1714 1718 1724 1741 1750 1752 1784 1785 1788 1795 1800 1801 1802 1804 1805 1806 1807 1808 1815 1816 1819 1820 1821 1822 1824 1825 1828 1829 1830 1831 1832 1837 1843 1856 1857 1858 1889 1897 1968 1984 2023 2066 2088 2090 2091 2100 2101 2102 2103 2104 2105 2109 2112 2113 2147 2148 2149 2152 2153 2166 2175 2184 2185 2192 2193 2194 2199 2201 2202 2205 2218 2226 2237 2248 2257 2290 2294 2343 2353 2354 2359 2360 2362 2363 2370 2388 2395 2396 2400 2465 2466 2467 2493 2534 2601 2609 2626 2629 2630 2643 2651 2657 2730
## Total number of rows in DyadResponse: 2737
## Number of rows with a single entry in DyadResponse: 2466
## Number of rows with multiple entries in DyadResponse: 271
## Sum of single and multiple entry rows: 2737
##
##
## Table: Unique DyadResponse Combinations and Their Frequencies
##
## |Combination | Frequency|
## |:-------------------------------------------------|---------:|
## |Intrusion;Tolerance | 49|
## |Losing interest;Not approaching | 47|
## |Intrusion;Not approaching | 30|
## |Male aggress female;Tolerance | 27|
## |Looks at partner;Tolerance | 23|
## |Losing interest;Tolerance | 22|
## |Female aggress male;Tolerance | 19|
## |Looks at partner;Not approaching | 9|
## |Other;Tolerance | 7|
## |Distracted;Not approaching | 6|
## |Not approaching;Tolerance | 4|
## |Distracted;Losing interest | 3|
## |Female aggress male;Male aggress female;Tolerance | 3|
## |Female aggress male;Not approaching | 3|
## |Distracted;Tolerance | 2|
## |Female aggress male;Intrusion;Tolerance | 2|
## |Male aggress female;Not approaching | 2|
## |Distracted;Intrusion;Not approaching | 1|
## |Distracted;Intrusion;Tolerance | 1|
## |Female aggress male;Intrusion | 1|
## |Intrusion;Losing interest | 1|
## |Intrusion;Losing interest;Not approaching | 1|
## |Intrusion;Male aggress female | 1|
## |Intrusion;Not approaching;Tolerance | 1|
## |Looks at partner;Losing interest;Not approaching | 1|
## |Looks at partner;Male aggress female | 1|
## |Looks at partner;Male aggress female;Tolerance | 1|
## |Looks at partner;Other;Tolerance | 1|
## |Losing interest;Not approaching;Tolerance | 1|
## |Not approaching;Other | 1|
##
##
## Table: Rows with Single Responses in DyadResponse
##
## |DyadResponse | Frequency|
## |:-------------------|---------:|
## |Tolerance | 1809|
## |Not approaching | 465|
## |Male aggress female | 73|
## |Intrusion | 51|
## |Losing interest | 37|
## |Female aggress male | 18|
## |Other | 9|
## |Distracted | 4|
##
##
## Table: Rows with Multiple Responses in DyadResponse
##
## |Combination | Frequency|
## |:-------------------------------------------------|---------:|
## |Intrusion;Tolerance | 49|
## |Losing interest;Not approaching | 47|
## |Intrusion;Not approaching | 30|
## |Male aggress female;Tolerance | 27|
## |Looks at partner;Tolerance | 23|
## |Losing interest;Tolerance | 22|
## |Female aggress male;Tolerance | 19|
## |Looks at partner;Not approaching | 9|
## |Other;Tolerance | 7|
## |Distracted;Not approaching | 6|
## |Not approaching;Tolerance | 4|
## |Distracted;Losing interest | 3|
## |Female aggress male;Male aggress female;Tolerance | 3|
## |Female aggress male;Not approaching | 3|
## |Distracted;Tolerance | 2|
## |Female aggress male;Intrusion;Tolerance | 2|
## |Male aggress female;Not approaching | 2|
## |Distracted;Intrusion;Not approaching | 1|
## |Distracted;Intrusion;Tolerance | 1|
## |Female aggress male;Intrusion | 1|
## |Intrusion;Losing interest | 1|
## |Intrusion;Losing interest;Not approaching | 1|
## |Intrusion;Male aggress female | 1|
## |Intrusion;Not approaching;Tolerance | 1|
## |Looks at partner;Losing interest;Not approaching | 1|
## |Looks at partner;Male aggress female | 1|
## |Looks at partner;Male aggress female;Tolerance | 1|
## |Looks at partner;Other;Tolerance | 1|
## |Losing interest;Not approaching;Tolerance | 1|
## |Not approaching;Other | 1|
### 3.9.3 Dyad Response Hierarchy
Projection of the hierarchy (changes will be made):

- Aggression > Tolerance
- Tolerance > Not approaching
  - Create a variable called hesitant, in addition to the tolerance count, to see the frequency of tolerance behaviour that happened after > 1 min
- Tolerance > Losing interest
- Tolerance > Intrusion
- Not approaching = looking at the box but not coming, while Losing interest = not paying attention to the box
- Intrusion > Losing interest
- Intrusion > Not approaching
- Not approaching > Looks at partner
  - We can code every look at partner as not approaching and keep the count of looks at partner as additional information
- Not approaching > Losing interest? (still to be decided)
- Define Distracted
- Not approaching > Distracted
- Aggression > Not approaching
- Other > Look case by case and categorize depending on the behavior
- Remarks may be used for the same reason
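Once the hierarchy is settled, one possible way to apply it is to rank the labels and keep only the highest-ranked label found in each cell. The sketch below assumes entries are separated by `;`. The `priority` vector and the `resolve_response()` helper are hypothetical, and the order shown is only one reading of the projected hierarchy (the position of Losing interest, Looks at partner and Other in particular is an assumption).

```r
# Hypothetical priority order, highest first (one reading of the projected hierarchy).
priority <- c("Female aggress male", "Male aggress female", "Tolerance",
              "Intrusion", "Not approaching", "Losing interest",
              "Looks at partner", "Distracted", "Other")

# Keep the highest-priority label found in a semicolon-separated cell.
resolve_response <- function(x, priority) {
  vapply(x, function(cell) {
    if (is.na(cell)) return(NA_character_)
    labels <- trimws(strsplit(cell, ";", fixed = TRUE)[[1]])
    ranked <- labels[labels %in% priority]
    if (length(ranked) == 0) return(cell)  # unknown labels are left untouched
    ranked[which.min(match(ranked, priority))]
  }, character(1), USE.NAMES = FALSE)
}

# Illustrative cells only, using combinations that appear in the tables above.
example <- c("Intrusion;Tolerance", "Losing interest;Not approaching",
             "Male aggress female;Tolerance", "Tolerance", NA)
resolved <- resolve_response(example, priority)
print(resolved)
```

Changing the hierarchy then only means reordering `priority`, which may be easier to maintain than one chunk per pair of behaviors.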
First I want to see how many rows in DyadResponse have more than one entry per cell
Now that I know that there are 271 rows with multiple entries, I will print each combination to see which ones I have to deal with, and also display the combinations once the data is split.
I will make an intermediary summary to see the state of DyadResponse at the moment
A. Summary of DyadResponse
B. Identify rows with more than 1 entry
Step 1: Identify rows with more than 1 entry:
beforesummaryrowswithmultipleentries_1 <- which(sapply(BexClean$DyadResponse, function(x) length(unlist(strsplit(as.character(x), ";"))) > 1))
Display the rows with more than 1 entry in DyadResponse:
knitr::kable(BexClean[beforesummaryrowswithmultipleentries_1, ])
Print the number of cells with more than 1 entry and the total number of rows:
cat("Number of rows with more than 1 entry in DyadResponse:", length(beforesummaryrowswithmultipleentries_1), "")
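The same count can also be obtained without `sapply()`. This is a minimal sketch on a toy vector standing in for `BexClean$DyadResponse`; the values and the names `responses`, `n_entries`, `rows_multiple` and `rows_three_plus` are assumptions made for illustration.

```r
# Toy stand-in for BexClean$DyadResponse (values are illustrative only).
responses <- c("Tolerance", "Intrusion;Tolerance",
               "Female aggress male;Male aggress female;Tolerance", NA)

# Number of entries per cell; NA cells are counted as 0 entries.
n_entries <- lengths(strsplit(responses, ";", fixed = TRUE))
n_entries[is.na(responses)] <- 0L

rows_multiple   <- which(n_entries > 1)  # rows with more than 1 entry
rows_three_plus <- which(n_entries > 2)  # rows with more than 2 entries

cat("Rows with more than 1 entry:", length(rows_multiple), "\n")
cat("Rows with more than 2 entries:", length(rows_three_plus), "\n")
```

Keeping `n_entries` as its own vector means the "more than 1" and "more than 2" steps reuse the same split instead of repeating it.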
C. Identify rows with more than 2 entries
## Number of rows with more than 1 entry in DyadResponse: 271
## Unique Combinations and Counts for More than 1 Entry:
## Distracted & Intrusion 2
## Female aggress male & Intrusion 2
## Looks at partner & Other 1
## Losing interest & Intrusion 1
## Losing interest & Looks at partner 1
## Male aggress female & Female aggress male 3
## Male aggress female & Looks at partner 1
## Not approaching & Intrusion 1
## Not approaching & Losing interest 1
## Distracted & Losing interest 3
## Female aggress male & Intrusion 1
## Female aggress male & Not approaching 3
## Losing interest & Intrusion 1
## Male aggress female & Intrusion 1
## Male aggress female & Looks at partner 1
## Male aggress female & Not approaching 2
## Not approaching & Distracted 7
## Not approaching & Intrusion 32
## Not approaching & Looks at partner 10
## Not approaching & Losing interest 49
## Not approaching & Other 1
## Tolerance & Distracted 3
## Tolerance & Female aggress male 24
## Tolerance & Intrusion 53
## Tolerance & Looks at partner 25
## Tolerance & Losing interest 23
## Tolerance & Male aggress female 31
## Tolerance & Not approaching 6
## Tolerance & Other 8
## Change in Occurrences for the Chunk (Female aggress male > Not approaching): 2731 2737 2737 2478 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2400 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2573 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2718 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2708 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2710 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2702 2737 2737 2737 2732 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2478 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2167 2737 2737 2737 2460 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2720 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2162 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2737 2733 2578 2737 2737 2737 2737 2737 2737
## Change in Occurrences for the Chunk (Male aggress female > Not approaching): -2
## Change in Occurrences for the Chunk Tolerance > Distracted: -3
8. Female aggress male > Tolerance: Remove Tolerance if there is Female aggress male (23x)
## Change in Occurrences for the Chunk (Tolerance & Female aggress male): -24
9. Tolerance > Losing interest: Remove Losing interest if there is Tolerance (13x)
## Change in Occurrences for the Chunk (Tolerance & Losing interest): 0
10. Male aggress female > Tolerance: Remove Tolerance if there is Male aggress female (31x)
## Change in Occurrences for the Chunk (Tolerance & Male aggress female): -28
11. Tolerance > Not approaching: Remove Not approaching if there is Tolerance (6x)
## Change in Occurrences for teh Chunk (Tolerance & Not approaching): -6
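Each rule above has the same shape: if the higher-ranked label is in a cell, remove the lower-ranked one, then report how many cells still hold both. A minimal sketch of one such step follows. The helpers `drop_if_present()` and `both()` are hypothetical names, and the three cells are illustrative, not taken from the dataset.

```r
# Remove `loser` from any cell that also contains `winner` (hypothetical helper).
drop_if_present <- function(x, winner, loser) {
  vapply(x, function(cell) {
    if (is.na(cell)) return(NA_character_)
    labels <- trimws(strsplit(cell, ";", fixed = TRUE)[[1]])
    if (winner %in% labels) labels <- setdiff(labels, loser)
    paste(labels, collapse = ";")
  }, character(1), USE.NAMES = FALSE)
}

# Count the cells that contain both labels; this is always a single number.
both <- function(x, a, b) sum(grepl(a, x, fixed = TRUE) & grepl(b, x, fixed = TRUE))

before <- c("Female aggress male;Tolerance", "Not approaching;Tolerance", "Tolerance")
after  <- drop_if_present(before, winner = "Female aggress male", loser = "Tolerance")

change <- both(after, "Female aggress male", "Tolerance") -
  both(before, "Female aggress male", "Tolerance")
cat("Change in occurrences (Tolerance & Female aggress male):", change, "\n")
```

Wrapping the count in `sum()` keeps the reported change a scalar, which makes it easy to compare against the expected number for each rule.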
See when and how to use this code to replace the previous numbers:
BexClean <- BexClean[!(grepl("Tolerance", BexClean$DyadResponse) & (grepl("Female aggress male", BexClean$DyadResponse) | grepl("Male aggress female", BexClean$DyadResponse))), ]
BexClean <- BexClean[!(grepl("Not approaching", BexClean$DyadResponse) & grepl("Tolerance", BexClean$DyadResponse)), ]
BexClean <- BexClean[!(grepl("Losing interest", BexClean$DyadResponse) & grepl("Tolerance", BexClean$DyadResponse)), ]
BexClean <- BexClean[!(grepl("Intrusion", BexClean$DyadResponse) & grepl("Tolerance", BexClean$DyadResponse)), ]
BexClean <- BexClean[!(grepl("Losing interest", BexClean$DyadResponse) & grepl("Intrusion", BexClean$DyadResponse)), ]
BexClean <- BexClean[!(grepl("Not approaching", BexClean$DyadResponse) & grepl("Intrusion", BexClean$DyadResponse)), ]
BexClean <- BexClean[!(grepl("Distracted", BexClean$DyadResponse) & grepl("Not approaching", BexClean$DyadResponse)), ]
knitr::kable(BexClean, col.names = colnames(BexClean), caption = "Cleaned DyadResponse Data")
Code to display the row numbers and the responses in DyadResponse that have more than one entry
OtherResponse - Detailed cleaning to delete the column
Audience - Creation of Amount Audience and Density
IDIndividual1 - Not sure yet
Intruder ID
Remarks - Detailed Cleaning
Intrusion
Not Approaching
Losing Interest
Distracted
MultipleResponse
Amount Audience
DyadDistance
Distance
No trial
I also chose to directly create new dichotomous variables for “Not approaching”, “Intrusion”, “Losing interest” and “Distracted”. For this I would like the function to: 1. Check if there is a value different from “No individual”. 2. If the value ≠ “No individual”, take the response found in “DyadResponse”.
## The 'MultipleResponses' column already exists. No changes made.
## Occurrences of 'Not approaching' in DyadResponse: 564
## Occurrences of 'Intrusion' in DyadResponse: 135
## Occurrences of 'Losing interest' in DyadResponse: 112
## Occurrences of 'Distracted' in DyadResponse: 14
## Warning: Unknown or uninitialised column: `NotApproaching`.
## Warning: Unknown or uninitialised column: `LosingInterest`.
## Warning: Unknown or uninitialised column: `Distracted`.
## Occurrences of '1' in 'NotApproaching': 0
## Occurrences of '1' in 'Intrusion': 56
## Occurrences of '1' in 'LosingInterest': 0
## Occurrences of '1' in 'Distracted': 0
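The warnings above suggest that `NotApproaching`, `LosingInterest` and `Distracted` did not exist when they were counted. A minimal sketch of one way to create all four indicator columns explicitly from DyadResponse is shown below. The `toy` data frame and the `flags` lookup are assumptions made for illustration, and the sketch only covers the "label is present in DyadResponse" part, not the “No individual” check.

```r
# Toy data frame standing in for BexClean (values are illustrative only).
toy <- data.frame(
  DyadResponse = c("Tolerance", "Intrusion;Not approaching",
                   "Losing interest", "Distracted;Not approaching", NA),
  stringsAsFactors = FALSE
)

# New column name -> label searched for in DyadResponse (hypothetical lookup).
flags <- c(NotApproaching = "Not approaching", Intrusion = "Intrusion",
           LosingInterest = "Losing interest", Distracted = "Distracted")

for (col in names(flags)) {
  # grepl() returns FALSE for NA, so rows without a response get 0.
  toy[[col]] <- as.integer(grepl(flags[[col]], toy$DyadResponse, fixed = TRUE))
}

print(colSums(toy[names(flags)]))
```

Assigning with `toy[[col]] <-` creates the column if it is missing, so the later counts of '1' should no longer hit an uninitialised column.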
#### Dyad, Distance & Date
Trial graphs; I will have to check all of them
My goal here is to see if each dyad has a general evolution of its dyad distance through time and how much variation it has
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## `geom_smooth()` using formula = 'y ~ x'
## `summarise()` has grouped output by 'Dyad'. You can override using the
## `.groups` argument.
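Alongside the graphs, the two quantities of interest (general trend and amount of variation per dyad) can be summarised as numbers. The sketch below uses base R on a toy data frame. The column names `Dyad`, `Date` and `Distance` and all values are assumptions, not the real data.

```r
# Toy data (illustrative only): distance in metres per dyad and trial date.
toy <- data.frame(
  Dyad = rep(c("Nge Oerw", "Sey Sirk"), each = 4),
  Date = as.Date(rep(c("2022-09-27", "2022-10-04", "2022-10-11", "2022-10-18"), 2)),
  Distance = c(2, 1, 1, 0, 3, 3, 4, 5)
)

# Slope of Distance against days since the dyad's first trial, one value per dyad.
slopes <- sapply(split(toy, toy$Dyad), function(d) {
  days <- as.numeric(d$Date - min(d$Date))
  unname(coef(lm(d$Distance ~ days))[2])
})

# Standard deviation of the distances within each dyad, as a rough measure of variation.
spread <- tapply(toy$Distance, toy$Dyad, sd)

print(round(slopes, 3))
print(round(spread, 3))
```

A negative slope means the dyad's distance tends to shrink over time. A straight line is only a first approximation here, since the distance is changed in steps by the experimenters.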
* I may consider, in parallel to my hypotheses, separating the data into *4 seasons* to make a preliminary check of a potential effect of seasonality. Nevertheless, the fact that we did not use any tools to measure the weather, and the idea of making a categorization where 12 months of data will be separated into 4 categories without considering the actual temperature, food quantity and other elements related to seasonality, make this categorization quite arbitrary. I may do it but with no intention to include this in my scientific report.
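If the 4-season split is tried, it can be derived directly from Date. This is a minimal sketch assuming Southern Hemisphere meteorological seasons (consistent with the negative GPSS latitude); the month groupings and the `season_of()` helper are assumptions, not a definition used in the study.

```r
# Map dates to seasons; the month groupings are an assumption (Southern Hemisphere).
season_of <- function(date) {
  m <- as.integer(format(as.Date(date), "%m"))
  season <- character(length(m))
  season[m %in% c(12, 1, 2)] <- "Summer"
  season[m %in% 3:5]         <- "Autumn"
  season[m %in% 6:8]         <- "Winter"
  season[m %in% 9:11]        <- "Spring"
  factor(season, levels = c("Summer", "Autumn", "Winter", "Spring"))
}

seasons <- season_of(c("2022-09-27", "2023-01-15", "2023-07-01"))
print(seasons)
```

Each season covers three months, so the split is purely calendar-based and carries no information about temperature or food availability.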
Lines to check the unique values in MaleID and FemaleID to see if there are any problems with them:
unique_male_ids <- unique(BexClean$MaleID)
unique_female_ids <- unique(Bex$FemaleID)
What factors influence the rate at which individuals (vervets) learn to tolerate each other in a controlled box experiment?
Ex: The rate at which individuals (vervets) learn to tolerate each other in a box experiment is influenced by social factors (audience, social network, behavior of the partner) and idiosyncratic factors (age, rank)
The presence of a higher number of high-ranking individuals in the audience will negatively correlate with the level of tolerance achieved among vervets in the box experiment. This is expected to result in higher frequencies of aggressive behaviors, intrusions, and loss of interest, particularly from lower-ranking individuals.
Vervets’ tolerance levels in the box experiment will be influenced by their partner’s display of agonistic behaviors. Specifically, partners who exhibit more frequent agonistic behaviors towards their partner will lead to a decrease in that partner’s motivation to participate in future trials.
During the box experiment, vervet dyads will establish an “optimal” distance for interaction, characterized by a higher frequency of tolerance compared to other distances. This optimal distance is expected to signify that the individuals tolerate each other more effectively at this specific proximity.
The age and rank of individual vervets within the group will influence the success of the trials in the box experiment. Specifically, dyads with older and higher-ranking individuals are expected to exhibit lower rates of success compared to dyads consisting of younger and lower-ranked individuals. This decrease in success is anticipated to be associated with a higher frequency of aggressive behaviors displayed by older and higher-ranking individuals towards their partners. (I’m not sure this hypothesis makes sense, I have the feeling age and rank must have an influence but I don’t know how to put it, I will think about it)
Seasonality is expected to impact the motivation of vervet dyads to participate in the box experiment. We hypothesize that dyads will have lower motivation, as indicated by a reduced number of trials, during the summer months compared to the winter months. This difference in motivation is likely influenced by temperature and food availability. To test this hypothesis, we will categorize the data into four seasonal periods, each spanning three months, and analyze whether there is a significant effect of seasonality on the motivation to engage in the trials.
Variables Needed:
- DyadResponse (specifically, “aggression” responses)
- Amountaudience (to measure the number of individuals in the audience)
- Audience…15 (to identify the names of individuals in the audience for calculating dominance ranks)
- Elo rating of the individuals based on the ad libitum data collected in IVP (which I have to calculate asap)

Statistical Analysis: Logistic Regression, as it could analyze the influence of high-ranking individuals on the occurrence of aggression in dyad responses. This will help determine whether the presence of high-ranking individuals affects the likelihood of aggression.
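As a first idea of what such a logistic regression could look like in R, here is a sketch on simulated data. The column names `HighRankAudience` and `Aggression` are hypothetical, the numbers are generated and are not results, and a real model would probably also need to account for repeated trials of the same dyad.

```r
set.seed(1)  # simulated data only, not results from the experiment
n <- 200
toy <- data.frame(HighRankAudience = rpois(n, 2))

# Simulate aggression whose log-odds rise with the number of high-ranking individuals.
toy$Aggression <- rbinom(n, 1, plogis(-2 + 0.5 * toy$HighRankAudience))

# Logistic regression: binary outcome, binomial family with the default logit link.
fit <- glm(Aggression ~ HighRankAudience, data = toy, family = binomial)

print(summary(fit)$coefficients)
print(exp(coef(fit)))  # odds ratios
```

The exponentiated slope reads as the multiplicative change in the odds of aggression for each additional high-ranking individual in the audience.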
Variables Needed:
Statistical Analysis: Logistic Regression as it could be used to assess how the occurrence of aggression in dyad responses is influenced by the partner’s gender-specific agonistic behaviors.
Variables Needed:
Statistical Analysis: Generalized Linear Model (GLM) to investigate whether there is an optimal distance that leads to a higher likelihood of tolerance (Tolerance = 1).
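One possible way to look for an optimal distance is a binomial GLM with a quadratic distance term, so that the fitted probability of tolerance can peak at an intermediate distance. The sketch below runs on simulated data; the columns `Distance` and `Tolerance` and every value are assumptions.

```r
set.seed(2)  # simulated data only, not results from the experiment
toy <- data.frame(Distance = sample(0:5, 300, replace = TRUE))
toy$Tolerance <- rbinom(300, 1, plogis(1 - 0.4 * (toy$Distance - 2)^2))

# Observed tolerance rate at each distance.
rate <- tapply(toy$Tolerance, toy$Distance, mean)
print(round(rate, 2))

# The quadratic term allows a peak at an intermediate distance.
fit <- glm(Tolerance ~ Distance + I(Distance^2), data = toy, family = binomial)
b <- coef(fit)

# Vertex of the fitted parabola; it is a maximum only if the quadratic coefficient is negative.
peak <- -b[["Distance"]] / (2 * b[["I(Distance^2)"]])
print(peak)
```

Comparing `rate` with the fitted peak is a quick check on whether the quadratic shape is a reasonable description before relying on it.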
Variables Needed:
Statistical Analysis: Logistic Regression. It can be employed to determine whether the age and rank of individual vervets within dyads have an impact on the likelihood of tolerance (Tolerance = 1).
Variables Needed:
Statistical Analysis:
ANOVA or Kruskal-Wallis Test: Depending on the distribution of your trial data, you can use either ANOVA (if the data are normally distributed) or the Kruskal-Wallis test (for non-normally distributed data) to assess the impact of seasonality on the number of trials. If significant differences are found, you can follow up with post-hoc tests to identify which seasons differ from each other. Please note that the effectiveness of these analyses may depend on the distribution of your data and specific research objectives. You may also consider conducting exploratory data analysis (e.g., visualization) to gain a better understanding of your dataset before performing these analyses. Additionally, if you have specific questions about data preprocessing or variable transformations, feel free to ask for further guidance. -> I took this from ChatGPT, I have to look more into it
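To make the suggestion above concrete, here is a sketch of the Kruskal-Wallis route on simulated counts. The data frame, the unit (number of trials per observation) and the column names `Season` and `Trials` are assumptions made for illustration.

```r
set.seed(3)  # simulated counts only, not results from the experiment
toy <- data.frame(
  Season = factor(rep(c("Summer", "Autumn", "Winter", "Spring"), each = 10)),
  Trials = rpois(40, lambda = rep(c(6, 8, 10, 8), each = 10))
)

# Rough normality check to help choose between ANOVA and Kruskal-Wallis.
print(shapiro.test(toy$Trials))

# Kruskal-Wallis rank sum test of Trials across the four seasons.
kw <- kruskal.test(Trials ~ Season, data = toy)
print(kw$p.value)

# Post-hoc pairwise comparisons with Holm correction (normal approximation because of ties).
ph <- pairwise.wilcox.test(toy$Trials, toy$Season,
                           p.adjust.method = "holm", exact = FALSE)
print(ph$p.value)
```

With four groups the test has 3 degrees of freedom, and the post-hoc step is only worth reading if the overall test suggests a difference.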
REMARKS: So here are a few updates I made in the document. I also planned to send my cleaned data to Radu (the statistician of UNINE) as he was keen to help me find the right test. Of course I will also look again in Bshary’s and Charlotte’s work with the boxes and improve these suggestions that are quite simple for now
Also I still have to clean the last graphs about male/female aggression as I didn’t finish that yet. I just wanted to share my hypotheses and ideas for statistics so I can soon go into the “serious” work
Anyway, thank you in advance for your help <3
Michael
But: the intro needs a triangle shape, broad to narrow, ending with the research question > tolerance importance > animal kingdom, current knowledge / direction of knowledge we need > show how my experiment goes in that way. How to address the gap: answer with the research question
Then explain why choosing vervet monkeys, (IVP in methods), sociality, experiments made
Methods
IVP, research area, (goal, house, type people)
Population: groups, dyads, male/female, ranks..
Box material: boxes, remotes, batteries, camera, tripod, corn (no marmalade ;), (water spray, security reason, non-aggressive way to select individuals and not engage with monkeys when recharging boxes with corn), pattern, previous distances, tablets, box experiment form
Tablets
(No observers mentioned)
Habituation boxes > individuals trained to recognize boxes, they have different levels of habituation
Patterns > appendix (mention similar to habituation, used to recognize box but efficiency depends on experience)
Selection dyads > assignment from Elo rating (different rank), if above average bond no dyad made, if not possible, availability of monkey also factor !! Non random can be a problem, think about why and how you selected data. We created variations in dyads made by different sex, rank and not above average bond (calculate bondedness)
Amount corn, do you want to mention it > maybe important. Calculate corn during and placement, cf paper on corn / food motivation
Corn (daily intake vervet % made from corn, cf site we saw, cf screenshot, compare with papers previously made and all)
1st dyad trial (BD) > appendix
Videos > details appendix
Finding dyads > appendix
Placement to attract them > mention if statistics made on placement corn
Trials (1 session = max 15 trials/in total) (session could be broken in different sub sessions to reach 15 trials max)
If aggression > 1m / If 2x tolerance < 1m, also if not approaching > 1m (if no tolerance increase distance except if intrusion) (Borgeaud > expectation of aggression)
Time of the day > appendix
Territory? > appendix
Amount sessions p day/week, how we chose the moment to follow them >appendix
Problems / unplanned events: weather, BGE’s, not finding the monkeys (group, dyad or individual), dispersal of males, river crossing, inaccessibility (experiments or boxes), low vision (experiments or monkeys) > appendix
(Where do I mention the confounding variables?) > look in the literature, if something that could affect and already reported in papers check, otherwise exclude “normal life” factors for both monkeys and experimenter
Types of experimental plan
Statistical tests (for each hypothesis)
Analysis
Results
Interpretation
Conclusion
• Pisor, A. C., & Surbeck, M. (2019). The evolution of intergroup tolerance in nonhuman primates and humans. Evolutionary Anthropology: Issues, News, and Reviews. Advance online publication. https://doi.org/10.1002/evan.21793 (Pisor & Surbeck, 2019)
| Date | Time | Data | Group |
|---|---|---|---|
| 2022-09-27 | 1899-12-31 09:47:50 | Box Experiment | Baie Dankie |
| 2022-09-27 | 1899-12-31 09:50:07 | Box Experiment | Baie Dankie |
| 2022-09-27 | 1899-12-31 09:53:11 | Box Experiment | Baie Dankie |
| 2022-09-27 | 1899-12-31 09:54:28 | Box Experiment | Baie Dankie |
| 2022-09-27 | 1899-12-31 09:55:19 | Box Experiment | Baie Dankie |
| 2022-09-27 | 1899-12-31 09:56:56 | Box Experiment | Baie Dankie |
| GPSS | GPSE | MaleID | FemaleID |
|---|---|---|---|
| -28.010549999999999 | 31.191050000000001 | Nge | Oerw |
| -28.010549999999999 | 31.191050000000001 | Nge | Oerw |
| -28.010549999999999 | 31.191050000000001 | Nge | Oerw |
| -28.010549999999999 | 31.191050000000001 | Nge | Oerw |
| -28.010549999999999 | 31.191050000000001 | Nge | Oerw |
| -28.010549999999999 | 31.191050000000001 | Nge | Oerw |
| Male placement corn | MaleCorn | FemaleCorn | DyadDistance | DyadResponse |
|---|---|---|---|---|
| NA | 3 | NA | 2m | Tolerance |
| NA | 3 | NA | 2m | Tolerance |
| NA | 3 | NA | 1m | Tolerance |
| NA | 3 | NA | 1m | Tolerance |
| NA | 3 | NA | 0m | Tolerance |
| NA | 3 | NA | 0m | Tolerance |
| OtherResponse | Audience | IDIndividual1 | IntruderID |
|---|---|---|---|
| NA | Obse; Oup; Sirk | NA | NA |
| NA | Obse; Oup; Sirk | NA | NA |
| NA | Oup; Sirk | NA | NA |
| NA | Sirk | NA | NA |
| NA | Sey; Sirk | NA | NA |
| NA | Sey; Sirk | NA | NA |
| Remarks |
|---|
| NA |
| NA |
| Nge box did not open because of the battery. Oerw vocalized to MA when he ap to the box to open it. |
| Sey came to the boxes once they were open |
| NA |
| NA |
| Observers | DeviceId |
|---|---|
| Josefien; Michael; Ona; Zonke | {7A4E6639-7387-7648-88EC-7FD27A0F258A} |
| Josefien; Michael; Ona; Zonke | {7A4E6639-7387-7648-88EC-7FD27A0F258A} |
| Josefien; Michael; Ona; Zonke | {7A4E6639-7387-7648-88EC-7FD27A0F258A} |
| Josefien; Michael; Ona; Zonke | {7A4E6639-7387-7648-88EC-7FD27A0F258A} |
| Josefien; Michael; Ona; Zonke | {7A4E6639-7387-7648-88EC-7FD27A0F258A} |
| Josefien; Michael; Ona; Zonke | {7A4E6639-7387-7648-88EC-7FD27A0F258A} |
```beforeNA